Please see the video here on how to download the repository and get set up: https://www.youtube.com/watch?v=MIxZdjMa5eo
R markdown is great, and when I say great, I mean REALLY great.
When exporting your markdown you can…
(1) embed figures
(2) view tables
(3) export models
(4) archive your data pipeline
You can write code and execute functions in a similar way to normal R script files. However, there are a few distinct differences between R scripts and R markdown scripts. Some of these differences relate to how code is executed. Other differences can be seen in the added benefit of integrating scripts with outputs and other embedded properties.
The benefit of using markdown with colleagues is they can actually see the script, the figures, the path of model selection, even the assumptions of ANOVA (QQ plots, etc) that normally are hidden behind the curtain.
In short, R markdown helps make your science repeatable, sharable, and data analysis coherent!
Here are some common commands and examples of what Markdown can do for you. I recommend visiting the [R Bookdown for a complete and exhaustive guide](https://bookdown.org/yihui/rmarkdown/) or the R markdown cheat sheet.
If you want to test yourself, the R Markdown tutorial is another great guide.
For the advanced Markdown-ers, you can also write and code a manuscript in markdown that can be exported to Word, with references embedded. Markdown magic! R markdown: Publishing workflows and manuscript.
Before we get started, let us discuss version control and software carpentry. It is increasingly important for you to back up and secure your data. This is true not only for your final data but also for the versions of your data on the way to the “ultimate” data version. Think of this as a software update. First, you need to have a version of the software; this version may change as your team/PI/committee or others ask for changes to your code. You can then update your code to reflect new needs, while also maintaining an archive of what the code WAS. This is, in effect, version control. It helps you back up your work and maintain an archive of where you were and where you are.
Now, carpentry. Having a simple, clear directory of data folders is critical for making your data structure navigable. This is general housekeeping for your directory. It will make your code, output, and products flow and keep your directory clean and organized.
Is this all above your head? That’s okay. See some open access resources here to help you get started in R (notice the pages are all R-Markdown html objects!) and on the path to coding. R data carpentry course.
My first advice to you would be to set up a GitHub account. Having a GitHub (or other online repo) allows you to have all your code, markdown, figures, data, etc. archived online for safety and version control. You can make these repositories private or collaborative. It is easy! There are a lot of resources out there on why/how to use GitHub, and you’ll see why it is important as you advance your carpentry. You can read about how painless this is.
Whether you choose to GIT with it, or stick with running from your device, you should start by making an R project in R-Studio. You can read all about how to do this at happygitwithR.com.
In R Studio you can run a Project from your local directory or make one with version control. Clicking the option to make an R project with version control will set you on the path to linking with GitHub (if you go this route, see happygitwithR.com first, as you must set up a GitHub account, then make a repository, then use the repository URL to link to your R project). There are a few more steps to integrating R and GitHub now, since GitHub has moved to two-factor authentication, but you can see the how-to and some troubleshooting in the linked sites.
If you decide instead to go without version control, then set up the project wherever you wish on your computer. When you set your “R Project” directory, R Studio will generate a folder that houses your R Project. From here you can create your Rmd file and folder directory as you wish. For best practices, I recommend setting up a directory for each project (see directory for this project below).
From here, every time you want to work on this project in R studio, click on the Project. This will allow you to load data in a simple way that anyone can understand (i.e., upload file “symbiont.csv”, from folder “data”), and it is easy to share this structure with collaborators.
Yet another reason to use R Projects: it will eliminate the pesky and sometimes overly personal directories we all have on our desktop aka: “~/Desktop/PhD/WTFproject/NeverGoingToGetPublished/data.csv”
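As a minimal sketch of what loading data from inside an R Project can look like, the snippet below builds a path relative to the project folder. It assumes the hypothetical layout mentioned above (a file "symbiont.csv" inside a folder "data"); the object names rel_path and symbiont are just for illustration.

```r
# Hypothetical layout, using the names from the example above:
#   <project folder>/data/symbiont.csv
rel_path <- file.path("data", "symbiont.csv")

if (file.exists(rel_path)) {
  # The relative path works for anyone who opens the same R Project
  symbiont <- read.csv(rel_path)
  str(symbiont)
} else {
  # Show where R actually looked, which helps when the wrong folder is open
  message("No file found at: ", file.path(getwd(), rel_path))
}
```

Because the path never mentions your Desktop or user name, a collaborator can run the same line without editing it.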
Figure 1. Data Directory example
Notice the script at the very top of your markdown (this is not standard, but is customized)
output:
  html_document:
    code_folding: hide
    toc: yes
    toc_depth: 3
    toc_float: yes
***
Let’s go through this line by line…
html_document: an option you originally set when you open the markdown. This is how your file will be exported.
code_folding: hide will allow you to show or hide all code (option at top of file)
toc: for table of contents on the left
toc_depth: how many header levels you will have in table of contents
toc_float: lets your contents move as you advance in your document.
Play around with these options, and see how they affect your code
There are other options too, such as number_sections: true to add numbering to your sections.
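Put together, a complete header might look like the sketch below. The title is a hypothetical placeholder, and the --- lines are the delimiters that mark the start and end of the YAML block at the very top of an .Rmd file. Indentation matters here: each option must sit under html_document.

```yaml
---
title: "Intro to Markdown"   # hypothetical title, change to your own
output:
  html_document:
    code_folding: hide
    toc: yes
    toc_depth: 3
    toc_float: yes
    number_sections: true
---
```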
First, if you want to make a new chunk (where code is written), use the shortcut! Ctrl + Alt + I (Cmd + Option + I on macOS) will generate a new chunk for you.
r setup: This is your setup code. By default, knitr treats the folder containing your .Rmd file as the working directory and looks for files relative to it. By setting root.dir you can force knitr to look for files in the directory you specify and the folders within.
In the setup chunk (the first code chunk) you can set ‘global’ options, such as message/warning exclusion, or hiding results or outputs.
knitr::opts_chunk$set(warning=FALSE, message=FALSE): This calls opts_chunk$set() from the knitr package (the knitr:: prefix means the package does not have to be loaded first) and sets the default for every chunk to “make all warning and message FALSE”, i.e., hide them from output.
include=FALSE: this option keeps both the code and its output out of the final html. The code is still executed, so its results are available to later chunks.
eval=FALSE: this option shows the code but does not evaluate (or run) it. The chunk has no effect on your analysis and produces no output in the html.
echo=FALSE: this will show the output but not the code chunk… notice the difference.
results='hide': this will hide all results in a code chunk, or any returned results from your commands.
collapse = TRUE: this is one of my new favorites. Add this to the chunk options to have a chunk's code and output rendered as one block instead of being split into pieces around each result. This is useful if you have lots of outputs (even if they are hidden using results='hide', the code chunk will still give breaks where results would be; collapse = TRUE stops this).
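The options above can be tried out with a short sketch of a setup chunk body. It assumes the knitr package is installed; project_root and the chunk label model-fit are hypothetical names, and using the current working directory as root.dir is only an illustrative choice.

```r
# Body of a setup chunk; assumes the knitr package is installed.

# Global defaults: hide warnings and messages in every chunk.
knitr::opts_chunk$set(warning = FALSE, message = FALSE)

# Hypothetical choice of root directory: wherever the session currently is.
# A root.dir set here applies to the chunks that follow, not to this one.
project_root <- getwd()
knitr::opts_knit$set(root.dir = project_root)

# Check what is now in effect.
knitr::opts_chunk$get("warning")
knitr::opts_chunk$get("message")
knitr::opts_knit$get("root.dir")

# Per-chunk options go in the chunk header instead, for example:
#   {r model-fit, echo=FALSE, results='hide'}
```

Options set in a chunk header override the global defaults for that chunk only, so you can hide messages everywhere and still show them in the one chunk where they matter.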
It is useful to understand how to modify text and format headings. You can do this in a variety of ways.
line break: this is executed by two spaces at the end of previous line, and a return
headings: use # for headings, #…. #### and so on. # is largest and #### smallest heading
Italics comes from two calls, `*italics*` or `_italics_`; bold is done the same way, `**bold**` and `__bold__`
Superscript (`superscript^2^`) or strikethroughs (`~~strikethrough~~`) can also be useful text modifiers.
Subscript (`subscript~2~`) is another useful one for chemistry: `CO~2~`, `NO~3~^-^`, and the like…
add links to URLs like this: `[R studio link](url)`
endash: `--` gives –
emdash: `---` gives —
ellipsis: `...` gives …
inline equation: wrap LaTeX in dollar signs, e.g. `$A = \pi*r^{2}$` or `$y = a*x + b$`
(a line can be inserted with ***)
using > at the start of a line gives you a quote/emphasis, such as…
“Its Van Halen, not Van Hagar!”
- Garth Algar
Code chunks are where your code is executed. If you do not set the working directory in setup, each code chunk will revert to the default directory, i.e., where your .Rmd file lives. This is one more reason why you should run everything out of your R Project.
## love your data and it will love you back
getwd() #where are your files? See how they should all be running from your R project in the directory
## [1] "/Users/chriswall/Desktop/Research and Teaching/github/Intro to Markdown"
Below: This is a code chunk, and this is how you enter your data into R markdown. Note that a code chunk always starts with ```{r...} and ends with ```. You can add these chunks with the shortcut Ctrl + Alt + I (Cmd + Option + I on macOS).
The code chunk below is an example of attaching the data (.csv), familiar to you R-heads. Since the data file is in the directory specified by knitr::opts_knit$set(root.dir =... above, you can reference the .csv easily.
################################################
# import data, observe structure
################################################
library(dplyr) # provides %>%, mutate(), and across()
# data file is in the folder 'data', within main working directory
data <- read.csv("data/coral_data.csv")
# set factor levels: convert the grouping columns to factors
cols <- c("Time.point", "Period", "Site", "Species", "Status", "Sample.ID")
# across() replaces the deprecated mutate_at(..., funs()) form (needs dplyr >= 1.0.0)
data <- data %>% mutate(across(all_of(cols), factor))
head(data)
##   Time.point    Period Site Species Status Sample.ID Pair  Depth.ft  biomass
## 1   2014 Oct Bleaching HIMB      MC      B         3    2 0.6666667 21.22296
## 2   2014 Oct Bleaching HIMB      MC     NB         4    2 0.6666667 27.31830
## 3   2014 Oct Bleaching HIMB      MC      B         5    3 1.0000000 19.74599
## 4   2014 Oct Bleaching HIMB      MC     NB         6    3 1.0000000 15.44902
## 5   2014 Oct Bleaching HIMB      PC      B         9    5 1.6666667 26.27286
## 6   2014 Oct Bleaching HIMB      PC     NB        10    5 1.6666667 23.45818
##         chla
## 1 2.57168099
## 2 4.76561135
## 3 0.25859937
## 4 2.67885949
## 5 0.04306378
## 6 7.40960243
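If you want to try the factor conversion without the coral data file, here is a small base R sketch on a made-up data frame. The toy values and the site name "Reef" are hypothetical; only the column names echo the real data above.

```r
# Toy data frame (made-up values) with two grouping columns and one measurement
toy <- data.frame(
  Site    = c("HIMB", "HIMB", "Reef"),
  Status  = c("B", "NB", "B"),
  biomass = c(21.2, 27.3, 19.7),
  stringsAsFactors = FALSE
)

# Columns that should be treated as categories rather than plain text
cols <- c("Site", "Status")

# Convert each listed column to a factor; other columns are left alone
toy[cols] <- lapply(toy[cols], factor)

str(toy)            # Site and Status are now factors, biomass is still numeric
levels(toy$Status)  # the categories R found in Status
```

This does the same job as the dplyr line in the chunk above, just without any extra packages.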
You can include figures from script output as results, or figures from files in your directory. First, let’s see how you can add an image from a file to your markdown.